Book Reviews: Prosody and Speech Recognition

Authors

  • Alex Waibel
  • Joan Bachenko
Abstract

... chart parsing is reduced to a set of matrix operations on sparse matrices. Appendices list sample grammars and lexicons, which lends substance to the claims. "Speech" in this book refers to English only, which is never made explicit. This seems to be the normal case in American literature, however. Of course, most of the contribution is relevant to other languages as well.

The book also provides an interesting contribution in the area of finite-state properties of language, because the phrase structure grammars used are essentially finite-state. Other finite-state accounts (such as the two-level model by Koskenniemi [1984] and cascaded transducers by Kaplan and Kay [Kay 1983]) seem to have been less successful in combining structural information with segmental processes. Both of these other models are purely segmental, although syllables are sometimes referred to as contexts. An interesting problem concerning rule interaction in the proposed formalism is dealt with on page 113. There would be an obvious need for subtraction (for defining negative contexts) and intersection (for combining effects). Subtraction, however, turns out to exclude too much, whereas intersection is too permissive.

The book is well written and the argumentation proceeds logically. Both strong and weak points of the proposed theories are clearly presented. It gives a fair overall picture of the field of speech recognition, and much of the book could be suitable as a textbook. Nevertheless, some passages are addressed mainly to readers with considerable background. The main topic covers, of course, a specific slice of the whole field, namely the treatment of allophonic variation. One minor inconvenience is the use of a reference format that cites a number only, not the author and the year. This results in a small saving in space but a larger burden for the reader.

The book appears to be Church's (previously unpublished) doctoral dissertation from MIT, though this is not clearly indicated in the volume. Although not particularly new, it is still very valuable.

Similar Resources

The Effects of Culture and Gender on the Recognition of Emotional Speech: Evidence from Persian Speakers Living in a Collectivist Society

This paper reports on a behavioral study that explores the role of culture and gender in the recognition of emotional speech in an under-investigated cultural context (a collectivist society, i.e., Iran). Participants were asked to recognize the emotional prosody of a set of validated emotional vocal portrayals (including the five basic emotions). Findings of the experiment were then comp...

Extraction and Representation of Prosody for Speaker, Speech and Language Recognition

Thank you very much for downloading extraction and representation of prosody for speaker speech and language recognition. As you may know, people have looked hundreds of times for their chosen novels like this extraction and representation of prosody for speaker speech and language recognition, but end up with malicious downloads. Rather than reading a good book with a cup of tea in the afternoon, ins...

Automatic Building of Synthetic Voices from Audio Books

Current state-of-the-art text-to-speech systems produce intelligible speech but lack the prosody of natural utterances. Building better models of prosody involves development of prosodically rich speech databases. However, development of such speech databases requires a large amount of effort and time. An alternative is to exploit story style monologues (long speech files) in audio books. These...

A Factored Language Model for Prosody Dependent Speech Recognition

Prosody refers to the suprasegmental features of natural speech (such as rhythm and intonation) that are used to convey linguistic and paralinguistic information (such as emphasis, intention, attitude, and emotion). Humans listening to natural prosody, as opposed to monotone or foreign prosody, are able to understand the content with lower cognitive load and higher accuracy (Hahn, 1999). In aut...

Speech Recognition with Word Fragment Detection Using Prosody Features for Spontaneous Speech

This investigation proposes a novel approach to word fragment detection using prosody features for spontaneous speech recognition. Incomplete pronunciation of a word results in an ill-formed fragment in word building, which dramatically degrades the performance of the language model in speech recognition. Instead of the lexical word, the prosody word is used as the building block for spontaneous speech proces...

Publication date: 2002